\chapter{Shor's algorithm}
\label{ch:shors_algorithm}
OK, now for Shor's Algorithm:
how to factor $M = pq$ in $O\left( (\log M)^2 \right)$ time.

This is arguably the reason agencies such as the US's National
Security Agency have been diverting millions of dollars toward
quantum computing.

\section{The classical (inverse) Fourier transform}
The ``crux move'' in Shor's algorithm is the so-called
quantum Fourier transform.
The Fourier transform is used to extract \emph{periodicity} in data,
and it turns out the quantum analogue is a lot faster than the classical one.

Let me throw the definition at you first.
Let $N$ be a positive integer, and let $\omega_N = \exp\left( \frac{2\pi i}{N} \right)$.
\begin{definition}
	Given a tuple of complex numbers
	\[ \left( x_0, x_1, \dots, x_{N-1} \right) \]
	its \vocab{discrete inverse Fourier transform} is
	the sequence $(y_0, y_1, \dots, y_{N-1})$ defined by
	\[ y_k = \frac1N \sum_{j=0}^{N-1} \omega_N^{jk} x_j. \]
	Equivalently, one is applying the matrix
	\[
		\frac 1N
		\begin{bmatrix}
			1 & 1 & 1 & \dots & 1 \\
			1 & \omega_N & \omega_N^2 & \dots & \omega_N^{N-1} \\
			1 & \omega_N^2 & \omega_N^4 & \dots & \omega_N^{2(N-1)} \\
			\vdots & \vdots & \vdots & \ddots & \vdots \\
			1 & \omega_N^{N-1} & \omega_N^{2(N-1)} & \dots & \omega_N^{(N-1)^2}
		\end{bmatrix}
		\begin{bmatrix} x_0 \\ x_1 \\ \vdots \\ x_{N-1} \end{bmatrix}
		=
		\begin{bmatrix} y_0 \\ y_1 \\ \vdots \\ y_{N-1} \end{bmatrix}.
	\]
\end{definition}
The reason this operation is important is because it lets
us detect if the $x_i$ are periodic.
More generally, given a sequence of $1$'s appearing with period $r$,
the amplitudes will peak at inputs which are divisible by $\frac{N}{\gcd(N,r)}$.
Mathematically, we have that
\[
	x_k = \sum_{j=0}^{N-1} y_j \omega_N^{-jk}.
\]

\begin{example}
	[Example of discrete inverse Fourier transform]
	Let $N = 6$, $\omega = \omega_6 = \exp(\frac{2\pi i}{6})$
	and suppose $(x_0,x_1,x_2,x_3,x_4,x_5)=(0,1,0,1,0,1)$
	(hence $x_i$ is periodic modulo $2$).
	Thus,
	\begin{align*}
		y_0 &= \tfrac16\left(\omega^0 + \omega^0+ \omega^0\right) = 1/2 \\
		y_1 &= \tfrac16\left(\omega^1 + \omega^3 + \omega^5\right) = 0 \\
		y_2 &= \tfrac16\left( \omega^2 + \omega^{6} + \omega^{10} \right) = 0 \\
		y_3 &= \tfrac16\left( \omega^3 + \omega^9 + \omega^{15} \right) = -1/2 \\
		y_4 &= \tfrac16\left( \omega^4 + \omega^{12} + \omega^{20} \right) = 0 \\
		y_5 &= \tfrac16\left( \omega^5 + \omega^{15} + \omega^{25} \right) = 0.
	\end{align*}
	Thus, in the inverse transformation the ``amplitudes''
	are all concentrated at multiples of $3$;
	thus this reveals the periodicity of the original
	sequence by $\frac N3 = 2$.
\end{example}
\begin{remark}
	The fact that this operation is called the ``inverse''
	Fourier transform is mostly a historical accident
	(as my understanding goes).
	Confusingly, the corresponding quantum operation is the
	(not-inverted) Fourier transform.
\end{remark}
If we apply the definition as written, computing the transform takes $O(N^2)$ time.
It turns out that by a classical algorithm called the \vocab{fast Fourier transform}
(whose details we won't discuss, but it effectively ``reuses'' calculations),
one can reduce this to $O(N \log N)$ time.
However, for Shor's algorithm this is also insufficient;
we need something like $O\left( (\log N)^2 \right)$ instead.
This is where the quantum Fourier transform comes in.

\section{The quantum Fourier transform}
Note that to compute a Fourier transform, we need to multiply an $N \times N$ matrix
with an $N$-vector, so this takes $O(N^2)$ multiplications.
However, we are about to show that with a quantum computer,
one can do this using $O( (\log N)^2 )$ quantum gates when $N = 2^n$,
on a system with $n$ qubits.

First, some more notation:
\begin{abuse}
	In what follows, $\ket{x}$ will refer to
	$\ket{x_n} \otimes \ket{x_{n-1}} \otimes \dots \otimes \ket{x_1}$
	where $x = x_n x_{n-1} \dots x_1$ in binary.
	For example, if $n = 3$
	then $\ket{6}$ really means $\ket1 \otimes \ket1 \otimes \ket 0$.
	Likewise, we refer to $0.x_1x_2 \dots x_n$ as binary.
\end{abuse}
Observe that the $n$-qubit space now has an
orthonormal basis $\ket0$, $\ket1$, \dots, $\ket{N-1}$

\begin{definition}
	Consider an $n$-qubit state
	\[ \ket\psi = \sum_{k=0}^{N-1} x_k \ket{k}. \]
	The \vocab{quantum Fourier transform} is defined by
	\[
		\UQFT(\ket\psi) = \frac{1}{\sqrt N}\sum_{j=0}^{N-1}
		\left( \sum_{k=0}^{N-1} \omega_N^{jk} x_k \right) \ket{j}.
	\]
	In other words, using the basis $\ket0$, \dots, $\ket{N-1}$,
	$\UQFT$ is given by the matrix
	\[
		\UQFT = \frac{1}{\sqrt N}
		\begin{bmatrix}
			1 & 1 & 1 & \dots & 1 \\
			1 & \omega_N & \omega_N^2 & \dots & \omega_N^{N-1} \\
			1 & \omega_N^2 & \omega_N^4 & \dots & \omega_N^{2(N-1)} \\
			\vdots & \vdots & \vdots & \ddots & \vdots \\
			1 & \omega_N^{N-1} & \omega_N^{2(N-1)} & \dots & \omega_N^{(N-1)^2}
		\end{bmatrix}
	\]
\end{definition}
This is the exactly the same definition as before,
except we have a $\sqrt N$ factor added so that $\UQFT$ is unitary.
But the trick is that in the quantum setup, the matrix can be rewritten:
\begin{proposition}
	[Tensor representation]
	Let $\ket x = \ket{x_n x_{n-1} \dots x_1}$.
	Then
	\begin{align*}
		\UQFT( \ket{x_n x_{n-1} \dots x_1} )
		= \frac{1}{\sqrt N} &
		\left( \ket0 +\exp(2\pi i \cdot 0.x_1) \ket 1 \right) \\
		&\otimes \left( \ket0 +\exp(2\pi i \cdot 0.x_2x_1) \ket 1 \right) \\
		&\otimes \dots  \\
		&\otimes \left( \ket0 +\exp(2\pi i \cdot 0.x_n\dots x_1) \ket 1 \right)
	\end{align*}
\end{proposition}
\begin{proof}
	Direct (and quite annoying) computation.
	In short, expand everything.
\end{proof}

So by using mixed states, the quantum Fourier transform
can use this ``multiplication by tensor product'' trick that isn't possible classically.

Now, without further ado, here's the circuit.
Define the rotation matrices
\[ R_k = \begin{bmatrix} 1 & 0 \\ 0 & \exp(2\pi i/2^k) \end{bmatrix}. \]
Then, for $n=3$ the circuit is given by using controlled $R_k$'s as follows:
\[
	\Qcircuit @C=1em @R=.7em {
		\lstick{\ket{x_3}} & \gate{H} & \gate{R_2} & \qw & \gate{R_3}  & \qw & \qw & \rstick{\ket{y_1}} \\
		\lstick{\ket{x_2}} & \qw & \ctrl{-1} & \gate{H} & \qw & \gate{R_2} & \qw & \rstick{\ket{y_2}} \\
		\lstick{\ket{x_1}} & \qw & \qw & \qw & \ctrl{-2} & \ctrl{-1} & \gate{H} & \rstick{\ket{y_3}} \\
	}
\]
\begin{exercise}
	Show that in this circuit, the image of $\ket{x_3x_2x_1}$ is
	\[
		\Big(\ket0+\exp(2\pi i \cdot 0.x_1) \ket1\Big)
		\otimes \Big(\ket0+\exp(2\pi i \cdot 0.x_2x_1) \ket1\Big)
		\otimes \Big(\ket0+\exp(2\pi i \cdot 0.x_3x_2x_1) \ket1\Big)
	\]
	as claimed.
\end{exercise}

For general $n$, we can write this as inductively as
\[
	\Qcircuit @C=1em @R=.7em {
	\lstick{\ket{x_n}}     & \multigate{5}{\text{QFT}_{n-1}} & \gate{R_n} & \qw                    & \qw & \cdots & & \qw                    & \qw & \cdots & & \qw              & \qw      & \rstick{\ket{y_1}} \qw \\
	\lstick{\ket{x_{n-1}}}     & \ghost{\text{QFT}_{n-1}}        & \qw                    & \gate{R_{n-1}} & \qw & \cdots & & \qw                    & \qw & \cdots & & \qw              & \qw      & \rstick{\ket{y_2}} \qw \\
	\lstick{\vdots\ \ }    & \pureghost{\text{QFT}_{n-1}}    &                        &                        &     &        & &                        &     &        & &                  &          & \rstick{\ \ \vdots} \\
	\lstick{\ket{x_i}}     & \ghost{\text{QFT}_{n-1}}        & \qw                    & \qw                    & \qw & \cdots & & \gate{R_i} & \qw & \cdots & & \qw              & \qw      & \rstick{\ket{y_{n-i+1}}} \qw \\
	\lstick{\vdots\ \ }    & \pureghost{\text{QFT}_{n-1}}    &                        &                        &     &        & &                        &     &        & &                  &          & \rstick{\ \ \vdots} \\
	\lstick{\ket{x_2}} & \ghost{\text{QFT}_{n-1}}        & \qw                    & \qw                    & \qw & \cdots & & \qw                    & \qw & \cdots & & \gate{R_2} & \qw      & \rstick{\ket{y_{n-1}}} \qw \\
	\lstick{\ket{x_1}}     & \qw                             & \ctrl{-6}               & \ctrl{-5}               & \qw & \cdots & & \ctrl{-3}               & \qw & \cdots & & \ctrl{-1}         & \gate{H} & \rstick{\ket{y_n}} \qw
}
\]
\begin{ques}
	Convince yourself that when $n=3$ the two circuits displayed are equivalent.
\end{ques}

Thus, the quantum Fourier transform is achievable with $O(n^2)$ gates,
which is enormously better than the $O(N \log N)$ operations achieved by
the classical fast Fourier transform (where $N=2^n$).

\section{Shor's algorithm}
The quantum Fourier transform is the key piece of Shor's algorithm.
Now that we have it, we can solve the factoring problem.

Let $p,q > 3$ be odd primes, and assume $p \neq q$.
The main idea is to turn factoring an integer $M = pq$ into a problem
about finding the order of $x \pmod M$; the latter is a ``periodicity''
problem that the quantum Fourier transform will let us solve.
Specifically, say that an $x \pmod M$ is \emph{good} if
\begin{enumerate}[(i)]
	\ii $\gcd(x,M) = 1$,
	\ii The order $r$ of $x \pmod M$ is even, and
	\ii Factoring $0 \equiv (x^{r/2}-1)(x^{r/2}+1) \pmod M$,
	neither of the two factors is $0 \pmod M$.
	Thus one of them is divisible by $p$, and the other
	is divisible by $q$.
\end{enumerate}
\begin{exercise}
	[For contest number theory practice]
	Show that for $M = pq$ at least half of the residues
	in $\Zm M$ are good.
\end{exercise}

So if we can find the order of an arbitrary $x \in \Zm M$,
then we just keep picking $x$ until we pick a good one
(this happens more than half the time);
once we do, we compute $\gcd(x^{r/2}-1,M)$ using the Euclidean
algorithm to extract one of the prime factors of $M$, and we're home free.

Now how do we do this?  The idea is not so difficult:
first we generate a sequence which is periodic modulo $r$.
\begin{example}[Factoring $77$: generating the periodic state]
	Let's say we're trying to factor $M = 77$,
	and we randomly select $x = 2$, and want to find its order $r$.
	Let $n = 13$ and $N = 2^{13}$, and start by initializing the state
	\[ \ket\psi = \frac{1}{\sqrt N} \sum_{k=0}^{N-1} \ket k. \]
	Now, build a circuit $U_x$ (depending on $x$)
	which takes $\ket k \ket 0$ to $\ket k \ket{x^k \bmod M}$.
	Applying this to $\ket\psi \otimes \ket0$ gives
	\[ U(\ket\psi\ket0) =
		\frac{1}{\sqrt N} \sum_{k=0}^{N-1} \ket k \otimes \ket{x^k \bmod M}. \]
	Now suppose we measure the second qubit, and get a state of $\ket{128}$.
	That tells us that the collapsed state now, up to scaling, is
	\[ (\ket{7} + \ket{7+r} + \ket{7+2r} + \dots) \otimes \ket{128}. \]
\end{example}
The bottleneck is actually the circuit $U_x$;
one can compute $x^k \pmod M$ by using repeated squaring,
but it's still the clumsy part of the whole operation.

In general, the operation is:
\begin{itemize}
	\ii Pick a sufficiently large $N = 2^n$ (say, $N \ge M^2$).
	\ii Generate $\ket\psi = \sum_{k=0}^{2^n-1} \ket{k}$.
	\ii Build a circuit $U_x$ which computes $\ket{x^k \mod M}$.
	\ii Apply it to get a state
	$\frac{1}{\sqrt N} \sum_{k=0}^{2^n-1} \ket k \otimes \ket{x^k \mod M}$.
	\ii Measure the second qubit to cause the first qubit to
	collapse to something which is periodic modulo $r$.
	Let $\ket\phi$ denote the left qubit.
\end{itemize}

Suppose we apply the quantum Fourier transform to the left qubit $\ket\phi$ now:
since the left bit is periodic modulo $r$, we expect the transform
will tell us what $r$ is.
Unfortunately, this doesn't quite work out, since $N$ is a power of two,
but we don't expect $r$ to be.

Nevertheless, consider a state
\[ \ket\phi = \ket{k_0} + \ket{k_0+r} + \dots \]
so for example previously we had $k_0=7$ if we measured $128$ on $x=2$.
Applying the quantum Fourier transform, we see that the
coefficient of $\ket j$ in the transformed image is equal to
\[
	\omega_N^{k_0j} \cdot
	\left( \omega_N^{0} + \omega_N^{jr} + \omega_N^{2jr}
	+ \omega_N^{3jr} + \dots \right)
\]
As this is a sum of roots of unity, we realize we have
destructive interference unless $\omega_N^{jr} = 1$ (since $N$ is large).
In other words, we approximately have
\[
	\UQFT(\ket\phi)
	\approx
	\sum_{\substack{0 \le j < N \\ jr/N \in \ZZ}} \ket j
\]
up to scaling as usual.
The bottom line is that
\begin{moral}
	If we measure $\UQFT\ket\phi$ we obtain a $\ket j$ such that
	$\frac{jr}{N}$ is close to an $s \in \ZZ$.
\end{moral}
And thus given sufficient luck we can use continued fractions
to extract the value of $r$.

\begin{example}
	[Finishing the factoring of $M = 77$]
	As before, we made an observation to the second qubit,
	and thus the first qubit collapses to the state
	$\ket\phi = \ket7 + \ket{7+r} + \dots$.
	Now we make a measurement and obtain $j = 4642$, which means that
	for some integer $s$ we have
	\[ \frac{4642r}{2^{13}} \approx s. \]
	Now, we analyze the continued fraction of $\frac{4642}{2^{13}}$;
	we find the first few convergents are
	\[
		0, \;
		1, \;
		\half, \;
		\frac{4}{7}, \;
		\frac{13}{23}, \;
		\frac{17}{30}, \;
		\frac{1152}{2033}, \;
		\dots
	\]
	So $\frac{17}{30}$ is a good approximation,
	hence we deduce $s = 17$ and $r = 30$ as candidates.
	And indeed, one can check that $r = 30$ is the desired order.
\end{example}

This won't work all the time\footnote{%
	Not to mention the general issue of noise, but that's for
	engineers to worry about.
} (for example, we could get unlucky and
measure $j=0$, i.e.\ $s=0$, which would tell us no information at all).

But one can show that we succeed any time that \[ \gcd(s,r) = 1. \]
This happens at least $\frac{1}{\log r}$ of the time,
and since $r < M$ this means that given sufficiently many trials,
we will eventually extract the correct order $r$.
This is Shor's algorithm.
